Filing Date: August 18, 2003

Title: DECOUPLED SCALAR/VECTOR COMPUTER ARCHITECTURE SYSTEM AND METHOD (As Amended)

## **IN THE CLAIMS**

Page 3

Dkt: 1376.699US1

Please amend the claims as follows:

1. (Previously Presented) In a computer system having a scalar processing unit and a vector processing unit, wherein the vector processing unit includes a vector dispatch unit, a method of decoupling operation of the scalar processing unit from that of the vector processing unit, the method comprising:

sending a vector instruction from the scalar processing unit to the vector dispatch unit, wherein sending includes marking the vector instruction as complete if the vector instruction is not a vector memory instruction and if the vector instruction does not require scalar operands;

reading a scalar operand, wherein reading includes transferring the scalar operand from the scalar processing unit to the vector dispatch unit;

predispatching the vector instruction within the vector dispatch unit if the vector instruction is scalar committed;

dispatching the predispatched vector instruction if all required operands are ready; and executing the dispatched vector instruction as a function of the scalar operand.

- 2. (Previously Presented) The method according to claim 1, wherein executing the dispatched vector instruction includes translating an address associated with the vector instruction and trapping on a translation fault.
- 3. (Previously Presented) In a computer system having a scalar processing unit and a vector processing unit, wherein the vector processing unit includes a vector dispatch unit, a method of decoupling operation of the scalar processing unit from that of the vector processing unit, the method comprising:

sending a vector instruction from the scalar processing unit to the vector dispatch unit, wherein sending includes marking the vector instruction as complete if the vector instruction is not a vector memory instruction and if the vector instruction does not require scalar operands;

DECOUPLED SCALAR/VECTOR COMPUTER ARCHITECTURE SYSTEM AND METHOD (As Amended)

reading a scalar operand, wherein reading includes transferring the scalar operand from the scalar processing unit to the vector dispatch unit;

predispatching the vector instruction within the vector dispatch unit if the vector instruction is scalar committed;

dispatching the predispatched vector instruction if all required operands are ready; generating an address for a vector load; issuing a vector load request to memory; receiving vector data from memory;

storing the vector data in a load buffer;

transferring the vector data from the load buffer to a vector register; and executing the dispatched vector instruction on the vector data stored in the vector register.

- 4. (Original) The method according to claim 3, wherein the vector processing unit includes a vector execute unit and a vector load/store unit, wherein issuing a vector load request to memory includes issuing and executing vector memory references in the vector load/store unit when the vector load store unit has received the instruction and memory operands from the scalar processing unit.
- 5. (Previously Presented) In a computer system having a scalar processing unit and a vector processing unit, wherein the vector processing unit includes a vector dispatch unit, a method of decoupling operation of the scalar processing unit from that of the vector processing unit, the method comprising:

sending a vector instruction from the scalar processing unit to the vector dispatch unit, wherein sending includes marking the vector instruction as complete if the vector instruction is not a vector memory instruction and if the vector instruction does not require scalar operands;

reading a scalar operand, wherein reading includes transferring the scalar operand from the scalar processing unit to the vector dispatch unit;

predispatching the vector instruction within the vector dispatch unit if the vector instruction is scalar committed;

Page 5 Dkt: 1376.699US1 Serial Number: 10/643,586

Filing Date: August 18, 2003

DECOUPLED SCALAR/VECTOR COMPUTER ARCHITECTURE SYSTEM AND METHOD (As Amended) Title:

dispatching the predispatched vector instruction if all required operands are ready; generating a first and a second address for a vector load; issuing first and second vector load requests to memory; receiving vector data associated with the first and second addresses from memory; storing vector data associated with the first address in a first vector register; storing vector data associated with the second address in a second vector register; executing a vector instruction on the vector data stored in the first vector register; renaming the second vector register; and

executing the dispatched vector instruction on the vector data stored in the second vector register.

- 6. (Previously Presented) The method according to claim 5, wherein the vector processing unit includes a vector execute unit and a vector load/store unit, wherein issuing a vector load request to memory includes issuing and executing vector memory references in the vector load/store unit when the vector load store unit has received the instruction and memory operands from the scalar processing unit.
- 7. (Previously Presented) A computer system, comprising:
  - a scalar processing unit; and
- a vector processing unit, wherein the vector processing unit includes a vector dispatch unit, a vector execute unit and a vector load/store unit;

wherein the scalar processing unit sends a vector instruction and a scalar operand to the vector dispatch unit, wherein sending includes marking the vector instruction as complete if the vector instruction is not a vector memory instruction and if the vector instruction does not require scalar operands;

wherein the vector dispatch unit predispatches the vector instruction within the vector dispatch unit if the vector instruction is scalar committed and then dispatches the predispatched vector instruction to one or more of the vector execute unit and the vector load/store unit if all required operands are ready;

DECOUPLED SCALAR/VECTOR COMPUTER ARCHITECTURE SYSTEM AND METHOD (As Amended)

wherein the vector load/store unit receives an instruction and memory operands from the scalar processing unit, issues and executes a vector memory load reference as a function of the instruction and the memory operands received from the scalar processing unit, and stores data received as a result of the vector memory reference in a load buffer; and

wherein the vector execute unit issues the vector memory load instruction and transfers the data received as a result of the vector memory reference from the load buffer to a vector register.

(Previously Presented) In a computer system having a scalar processing unit and a 8. vector processing unit, a method of decoupling scalar and vector execution, comprising:

dispatching scalar instructions to a scalar instruction queue;

sending a vector instruction that requires scalar operands to the scalar instruction queue and to a vector instruction queue, wherein sending includes marking the vector instruction as complete if the vector instruction is not a vector memory instruction and if the vector instruction does not require scalar operands;

executing the vector instruction in the scalar processing unit, wherein executing the vector instruction in the scalar processing unit includes writing a scalar operand to a scalar operand queue;

predispatching the vector instruction sent to the vector instruction queue if the vector instruction is scalar committed:

notifying the vector processing unit that the scalar operand is available in the scalar operand queue;

dispatching the predispatched vector instruction if all required scalar operands are ready; and

executing the dispatched vector instruction in the vector processing unit, wherein executing the vector instruction in the vector processing unit includes reading the scalar operand from the scalar operand queue.

Page 7 Dkt: 1376.699US1

Serial Number: 10/643,586

Filing Date: August 18, 2003

Title: DECOUPLED SCALA

DECOUPLED SCALAR/VECTOR COMPUTER ARCHITECTURE SYSTEM AND METHOD (As Amended)

9. (Previously Presented) In a computer system having a scalar processing unit and a vector processing unit, a method of decoupling a vector memory reference and a vector execution, comprising:

dispatching scalar instructions to a scalar instruction queue;

sending a vector instruction to the scalar instruction queue and to a vector instruction queue, wherein sending includes marking the vector instruction as complete if the vector instruction is not a vector memory instruction and if the vector instruction does not require scalar operands;

executing the vector instruction in the scalar processing unit, wherein executing the vector instruction in the scalar processing unit includes generating an address and writing the address to a scalar operand queue;

predispatching the vector instruction sent to the vector instruction queue if the vector instruction is scalar committed;

notifying the vector processing unit that the address is available in the scalar operand queue;

dispatching the predispatched vector instruction if all required operands are ready; and executing the dispatched vector instruction in the vector processing unit, wherein executing the vector instruction in the vector processing unit includes reading the address from the scalar operand queue and generating a memory request as a function of the address read from the scalar operand queue.

10. (Previously Presented) In a computer system having a scalar processing unit and a vector processing unit, a method of executing a vector instruction, comprising:

dispatching scalar instructions to a scalar instruction queue;

sending a vector instruction to the scalar instruction queue and to a vector instruction queue, wherein sending includes marking the vector instruction as complete if the vector instruction is not a vector memory instruction and if the vector instruction does not require scalar operands;

Dkt: 1376.699US1

DECOUPLED SCALAR/VECTOR COMPUTER ARCHITECTURE SYSTEM AND METHOD (As Amended)

executing the vector instruction in the scalar processing unit, wherein executing the vector instruction in the scalar processing unit includes generating an address and writing the address to a scalar operand queue;

predispatching the vector instruction sent to the vector instruction queue if the vector instruction is scalar committed;

notifying the vector processing unit that the address is available in the scalar operand queue;

dispatching the predispatched vector instruction if all required operands are ready; and executing the dispatched vector instruction in the vector processing unit, wherein executing the dispatched vector instruction in the vector processing unit includes:

reading the address from the scalar operand queue;

generating a memory request as a function of the address read from the scalar operand queue;

receiving vector data from memory; storing the vector data in a load buffer; transferring the vector data from the load buffer to a vector register; and executing a vector instruction on the vector data stored in the vector register.

(Currently Amended) In a computer system having a scalar processing unit and a vector 11. processing unit, a method of unrolling a loop, comprising:

preparing a first and a second vector instruction, wherein each vector instruction execute an iteration through the loop and wherein each vector instruction requires calculation of a scalar loop value;

sending the first and second vector instructions to the scalar instruction queue and to a vector instruction queue, wherein sending includes marking the vector instruction as complete if the vector instruction is not a vector memory instruction and if the vector instruction does not require scalar operands;

executing a portion of each vector instruction in the scalar processing unit, wherein executing a portion of each vector instruction in the scalar processing unit includes writing a

scalar operand representing the scalar loop value calculated for each vector instruction to a scalar operand queue;

predispatching each vector instruction sent to the vector instruction queue if the vector instruction is scalar committed;

notifying the vector processing unit that the scalar operand is available in the scalar operand queue;

dispatching the predispatched vector instruction if all required operands are ready; and executing the first and second dispatched vector instructions in the vector processing unit, wherein executing the dispatched vector instruction in the vector processing unit includes reading the scalar operands associated with each instruction from the scalar operand queue.

- 12. (Previously Presented) The method according to claim 3, wherein storing the vector data in a load buffer and transferring the vector data from the load buffer to a vector register are decoupled from each other.
- 13. (Previously Presented) The method according to claim 3, wherein storing the vector data in a load buffer includes writing memory load data to the load buffer until all previous memory operations complete without fault.
- 14. (Previously Presented) The method according to claim 4, wherein storing the vector data in a load buffer and transferring the vector data from the load buffer to a vector register are decoupled from each other.
- 15. (Previously Presented) The method according to claim 4, wherein storing the vector data in a load buffer includes writing memory load data to the load buffer until all previous memory operations complete without fault.
- (Previously Presented) The system according to claim 7, wherein the load buffer stores 16. memory load data until it is determined that no previous memory operation will fail and, if no previous memory operations have failed, the load buffer transfers the data to the vector register.

AMENDMENT AND RESPONSE UNDER 37 CFR § 1.116 – EXPEDITED PROCEDURE

Serial Number: 10/643,586

Filing Date: August 18, 2003

DECOUPLED SCALAR/VECTOR COMPUTER ARCHITECTURE SYSTEM AND METHOD (As Amended)

(New) The method according to claim 1, wherein dispatching the predispatched vector 17.

instruction includes marking the vector instruction as complete if the vector instruction is not a

Page 10 Dkt: 1376.699US1

memory instruction but requires scalar operands and wherein executing the dispatched vector

instruction includes marking the vector instruction as complete after vector address translation if

the vector instruction is a memory instruction.

(New) The method according to claim 3, wherein dispatching the predispatched vector 18.

instruction includes marking the vector instruction as complete if the vector instruction is not a

memory instruction but requires scalar operands and wherein executing the dispatched vector

instruction includes marking the vector instruction as complete after vector address translation if

the vector instruction is a memory instruction.

19. (New) The method according to claim 5, wherein dispatching the predispatched vector

instruction includes marking the vector instruction as complete if the vector instruction is not a

memory instruction but requires scalar operands and wherein generating the first and the second

address for the vector load includes marking the vector instruction as complete after vector

address translation.

20. (New) The system according to claim 7, wherein dispatching the predispatched vector

instruction includes marking the vector instruction as complete if the vector instruction is not a

memory instruction but requires scalar operands and wherein executing the vector memory load

reference includes marking the vector instruction as complete after vector address translation.

21. (New) The method according to claim 8, wherein dispatching the predispatched vector

instruction includes marking the vector instruction as complete if the vector instruction is not a

memory instruction but requires scalar operands and wherein executing the dispatched vector

instruction further includes marking the vector instruction as complete after vector address

translation if the vector instruction is a memory instruction.

AMENDMENT AND RESPONSE UNDER 37 CFR § 1.116 - EXPEDITED PROCEDURE

AMENDMENT AND RESPONSE UNDER 37 CFR § 1.116 – EXPEDITED P Serial Number: 10/643,586

Filing Date: August 18, 2003

Title: DECOUPLED SCALAR/VECTOR COMPUTER ARCHITECTURE SYSTEM AND METHOD (As Amended)

22. (New) The method according to claim 9, wherein dispatching the predispatched vector instruction includes marking the vector instruction as complete if the vector instruction is not a memory instruction but requires scalar operands and wherein executing the dispatched vector instruction further includes marking the vector instruction as complete after vector address translation if the vector instruction is a memory instruction.

Page 11 Dkt: 1376.699US1

- 23. (New) The method according to claim 10, wherein dispatching the predispatched vector instruction includes marking the vector instruction as complete if the vector instruction is not a memory instruction but requires scalar operands and wherein executing the dispatched vector instruction further includes marking the vector instruction as complete after vector address translation if the vector instruction is a memory instruction.
- 24. (New) The method according to claim 11, wherein dispatching the predispatched vector instruction includes marking the vector instruction as complete if the vector instruction is not a memory instruction but requires scalar operands and wherein executing the dispatched vector instructions further includes marking the vector instructions as complete after vector address translation if the vector instructions are memory instructions.